Sample Path Optimal Policies for Serial Lines with Flexible Workers
Similar resources
Convergence of Sample Path Optimal Policies for Stochastic Dynamic Programming
We consider the solution of stochastic dynamic programs using sample path estimates. Applying the theory of large deviations, we derive probability error bounds associated with the convergence of the estimated optimal policy to the true optimal policy, for finite horizon problems. These bounds decay at an exponential rate, in contrast with the usual canonical (inverse) square root rate associat...
Path-clearing policies for flexible manufacturing systems
In practical manufacturing settings it is often possible to obtain, in real-time, information about the operation of several machines in a flexible manufacturing system (FMS) that can be quite useful in scheduling part flows. In this brief paper the authors introduce some scheduling policies that can effectively utilize such information (something the policies in [1] do not do) and they provide...
Learning near-optimal policies with fitted policy iteration and a single sample path
In this paper we consider the problem of learning a near-optimal policy in continuous-space, expected total discounted-reward Markovian Decision Problems using approximate policy iteration. We consider batch learning where the training data consists of a single sample path of a fixed, known, persistently-exciting stationary stochastic policy. We derive PAC-style bounds on the difference of the ...
Sample-Path Optimal Stationary Policies in Stable Markov Decision Chains with the Average Reward Criterion
This work concerns discrete-time Markov decision chains with denumerable state and compact action sets. Besides standard continuity requirements, the main assumption on the model is that it admits a Lyapunov function ℓ. In this context the average reward criterion is analyzed from the sample-path point of view. The main conclusion is that, if the expected average reward associated to ...
Optimal Hiring and Retention Policies for Heterogeneous Workers Who Learn
We study the hiring and retention of heterogeneous workers who learn over time. We show that the problem can be analyzed as an infinite-armed bandit with switching costs, and we apply results from Bergemann and Välimäki [Bergemann D, Välimäki J (2001) Stationary multi-choice bandit problems. J. Econom. Dynam. Control 25(10):1585–1594] to characterize the optimal hiring and retention policy. For ...
Journal
Journal title: Journal of Applied Probability
Year: 2012
ISSN: 0021-9002,1475-6072
DOI: 10.1017/s0021900200009281